Abstract

We consider three types of multivariate records in this paper and derive the mean and
the variance of their numbers for independent and uniform random samples from two prototype
regions: hypercubes $[0,1]^d$ and the $d$-dimensional simplex. Central limit theorems with
convergence rates are established when the variance tends to infinity. Effective numerical
procedures are also provided for computing the variance constants to a high degree of
precision.

1 Introduction 

While the one-dimensional records (or record-breakings, left-to-right maxima, outstanding
elements, etc.) of a given sample have been the subject of research and development for more
than six decades, considerably less is known for multidimensional records. One simple reason
is that there is no total ordering for multivariate data, so there is no unique way of defining
records in higher dimensions. We study in this paper the stochastic properties of three types of
records based on the dominance relation under two representative prototype models. In
particular, central limit theorems with convergence rates are proved for the number of
multivariate records when the variance tends to infinity, the major difficulty being the
asymptotics of the variance.

Dominance and maxima. A point $p \in \mathbb{R}^d$ is said to dominate another point
$q \in \mathbb{R}^d$ if $p - q$ has only positive coordinates, where the dimensionality $d \ge 1$.
Write $q \prec p$ or $p \succ q$. The nondominated points in the set $\{p_1, \dots, p_n\}$ are
called maxima. Maxima represent one of the most natural and widely used partial orders for
multidimensional samples when $d \ge 2$, and have been thoroughly investigated in the literature
under many different guises and names (such as admissibility, Pareto optimality, elites,
efficiency, skylines, ...); see [1,4] and the references therein.
Pareto records. A point $p_k$ is defined to be a Pareto record or a nondominated record of the
sequence $p_1, \dots, p_n$ if

$$p_k \nprec p_i \quad \text{for all } 1 \le i < k.$$

Such a record is referred to as a weak record in [14], but we found this term less informative.

In addition to being one of the natural extensions of the classical one-dimensional records,
the Pareto records of a sequence of points are also closely connected to maxima, the simplest
connection being the following bijection. If we consider the indices of the points as an
additional coordinate, then the Pareto records are exactly the maxima in the extended space (the
original one and the index-set) by reversing the order of the indices. Conversely, if we sort a
set of points according to a fixed coordinate and use the ranks as the indices, then the maxima
are nothing but the Pareto records in the induced space (with one dimension less); see [14]. See
also the recent paper [4] for the algorithmic aspects of such connections.

More precisely, assume that $p_1, \dots, p_n$ are independently and uniformly distributed
(abbreviated as iud) in a specified region $S$ and $q_1, \dots, q_n$ are iud in the region
$S \times [0,1]$. Then the distribution of the number of Pareto records of the sequence
$p_1, \dots, p_n$ is equal to the distribution of the number of maxima of the set
$\{q_1, \dots, q_n\}$. This connection will be used later in our analysis.
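The bijection between Pareto records and maxima in the extended space can be checked on a concrete sample. The following is a minimal sketch, not part of the original paper: the function names (`dominates`, `pareto_record_indices`, `maxima_indices`, `extend_with_reversed_index`) are hypothetical, and the random sample is only an illustration. Each point receives its reversed index as an extra coordinate, so that earlier points have a larger last coordinate.

```python
import random


def dominates(p, q):
    """Return True if p - q has only positive coordinates (q is dominated by p)."""
    return all(a > b for a, b in zip(p, q))


def pareto_record_indices(points):
    """Indices k such that no earlier point dominates points[k]."""
    return [k for k, p in enumerate(points)
            if not any(dominates(points[i], p) for i in range(k))]


def maxima_indices(points):
    """Indices of the points that no other point of the set dominates."""
    return [k for k, p in enumerate(points)
            if not any(dominates(q, p) for q in points)]


def extend_with_reversed_index(points):
    """Append the reversed index as an extra coordinate (earlier = larger)."""
    n = len(points)
    return [tuple(p) + (n - k,) for k, p in enumerate(points)]


rng = random.Random(2024)  # fixed seed, only for reproducibility
sample = [tuple(rng.random() for _ in range(3)) for _ in range(200)]
records = pareto_record_indices(sample)
maxima = maxima_indices(extend_with_reversed_index(sample))
print(records == maxima)
```

The equality holds for any input sequence, not only random ones: an extended point dominates another exactly when the original point dominates and comes earlier in the sequence.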

On the other hand, we also have, for any given region, the following relation between the
expected number $\mathbb{E}[X_n]$ of Pareto records and the expected number $\mathbb{E}[M_n]$ of
maxima of the same sample of points, say $p_1, \dots, p_n$,

$$\mathbb{E}[X_n] = \sum_{1 \le k \le n} \frac{\mathbb{E}[M_k]}{k};$$

see [4].
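The relation between the two expectations follows from exchangeability: $p_k$ is a Pareto record exactly when it is a maximum of $\{p_1, \dots, p_k\}$, which happens with probability $\mathbb{E}[M_k]/k$. As an illustrative sanity check (a sketch, not taken from the paper), the code below verifies the identity exactly for $d = 2$ and $n = 4$ by replacing the continuous sample with its rank pattern, that is, a uniformly random pair of permutations. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def dominates(p, q):
    """Return True if p - q has only positive coordinates."""
    return all(a > b for a, b in zip(p, q))


def count_maxima(pts):
    return sum(1 for p in pts if not any(dominates(q, p) for q in pts))


def count_pareto_records(pts):
    return sum(1 for k, p in enumerate(pts)
               if not any(dominates(pts[i], p) for i in range(k)))


def check_identity(n):
    """Return (E[X_n], sum_k E[M_k]/k) over all rank patterns in dimension 2."""
    perms = list(permutations(range(n)))
    total = len(perms) ** 2
    sum_records = 0
    sum_maxima = [0] * (n + 1)
    for xs in perms:
        for ys in perms:
            pts = list(zip(xs, ys))
            sum_records += count_pareto_records(pts)
            for k in range(1, n + 1):
                sum_maxima[k] += count_maxima(pts[:k])
    lhs = Fraction(sum_records, total)
    rhs = sum(Fraction(sum_maxima[k], total * k) for k in range(1, n + 1))
    return lhs, rhs


lhs, rhs = check_identity(4)
print(lhs, rhs)
```

Exact rational arithmetic is used so that the two sides can be compared without rounding.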



Dominating records. Although the Pareto records are closely connected to maxima, their
probabilistic properties have been less well studied in the literature. In contrast, the following
definition of records has received more attention.

A point $p_k$ is defined to be a dominating record of the sequence $p_1, \dots, p_n$ if

$$p_i \prec p_k \quad \text{for all } 1 \le i < k.$$

This is referred to as the strong record in [14] and the multiple maxima in [18].

Let the number of dominating records falling in $A \subset S$ be denoted by $Z_A$. Goldie and
Resnick [15] showed that

$$\mathbb{E}[Z_A] = \int_A \left(1 - \mu(D_x)\right)^{-1} \, d\mu(x),$$

where $D_x = \{y : y \prec x\}$. They also calculated all the moments of $Z_A$ and derived several
other results such as the probability of the event $\{Z_A = 0\}$ and the covariance
$\mathrm{Cov}(Z_A, Z_B)$.

In the special case when the $p_i$'s are iud with a common multivariate normal (non-degenerate)
distribution, Gnedin [13] proved that

$$\mathbb{P}(p_n \text{ is a dominating record}) \asymp n^{-\alpha} (\log n)^{(\alpha - \beta)/2},$$

for some $\alpha > 1$ and $\beta \in \{2, 3, \dots, d\}$. For finer asymptotic estimates, see [17].
Chain records. Yet another type of records of multi-dimensional samples introduced in [14]
is the chain record

$$p_1 \prec p_{i_1} \prec p_{i_2} \prec \cdots \prec p_{i_k},$$

where $1 < i_1 < i_2 < \cdots < i_k$ and there are no $p_j \succ p_{i_a}$ with
$i_a < j < i_{a+1}$ or $i_k < j \le n$. See Figure 1 for an illustration of the three different
types of records.

Figure 1: In this simple example, the dominating records are $p_1$ and $p_7$ (left), the chain
records are $p_1$, ps and $p_7$ (middle), and the Pareto records are $p_1$, $p_2$, Ps, $p_6$ and
$p_7$ (right), respectively.
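The three definitions can be made concrete with a small brute-force sketch. This is an illustration under the definitions above, not code from the paper; the function names and the six sample points are hypothetical, and indices are 0-based (so index 0 corresponds to $p_1$).

```python
def dominates(p, q):
    """Return True if p - q has only positive coordinates (q is dominated by p)."""
    return all(a > b for a, b in zip(p, q))


def maxima(points):
    """Indices of points not dominated by any point of the set."""
    return [k for k, p in enumerate(points)
            if not any(dominates(q, p) for q in points)]


def pareto_records(points):
    """Indices k such that no earlier point dominates points[k]."""
    return [k for k, p in enumerate(points)
            if not any(dominates(points[i], p) for i in range(k))]


def dominating_records(points):
    """Indices k such that points[k] dominates every earlier point."""
    return [k for k, p in enumerate(points)
            if all(dominates(p, points[i]) for i in range(k))]


def chain_records(points):
    """Start at the first point; each new record must dominate the last one."""
    if not points:
        return []
    chain = [0]
    for j in range(1, len(points)):
        if dominates(points[j], points[chain[-1]]):
            chain.append(j)
    return chain


pts = [(0.2, 0.3), (0.5, 0.1), (0.4, 0.6), (0.3, 0.5), (0.1, 0.75), (0.7, 0.8)]
print(dominating_records(pts), chain_records(pts), pareto_records(pts), maxima(pts))
```

In this example every dominating record is a chain record and every chain record is a Pareto record, which reflects the nesting of the three notions.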



Some known results and comparisons. If we drop the restriction of order, then the size of the
largest subset of indices such that

$$p_{i_1} \prec p_{i_2} \prec \cdots \prec p_{i_k} \qquad (1)$$

is equal to the number of maximal layers (maxima being regarded as the first layer, the maxima
of the remaining points being the second, and so on). Assuming that $\{p_1, \dots, p_n\}$ are iud
in the hypercube $[0,1]^d$, Gnedin [14] proved that the number of chain records $Y_n$ is
asymptotically Gaussian with mean and variance asymptotic to

$$\mathbb{E}[Y_n] \sim d^{-1} \log n, \qquad \mathbb{V}[Y_n] \sim d^{-2} \log n;$$

see Theorem 4 for an improvement. The author also derived exact and asymptotic formulas
for the probability of a chain record $\mathbb{P}(Y_n > Y_{n-1})$ and discussed some
point-process scaling limits.

The behavior of the record sequence (1) is studied in Goldie and Resnick [16] and in
Deuschel and Zeitouni [8]. The position of the points converges in probability to a (or a set
of) deterministic curve(s). Deuschel and Zeitouni [8] also proved a weak law of large numbers
for the longest increasing subsequence, extending a result by Vershik and Kerov [21] to
a non-uniform setting; see also the breakthrough paper [3]. A completely different type of
multivariate records based on convex hulls was discussed in [20].

Chain records can in some sense be regarded as uni-directional Pareto records, and thus
lack the multi-directional feature of Pareto records. The asymptotic analysis of the moments
is in general simpler than that for the Pareto records. On the other hand, it is also in this
respect that the chain records reflect better the properties exhibited by the one-dimensional
records. Interestingly, the chain records correspond to the "left-arm" (starting from the root by
always choosing the subtree corresponding to the first quadrant) of quadtrees; see [5, 10] and the
references therein.

[Figure 2 diagram: DOMINANCE at the top; below it MAXIMA (with maximal layers and depth) and
RECORDS (with chain records and dominating records), with Pareto records placed between the two.]

Figure 2: A diagram illustrating the diverse notions defined on dominance; in particular, the
Pareto records can be regarded as a good bridge between maxima and multivariate records.



A summary of results. We consider in the paper the distributional aspect of the above three
types of records in two typical cases when the $p_i$'s are iud in the hypercube $[0,1]^d$ and in
the $d$-dimensional simplex, respectively. Briefly, hypercubes correspond to situations when the
coordinates are independent, while the $d$-dimensional simplex corresponds to situations when the
coordinates are to some extent negatively correlated. The hypercube case has already been studied
in [14]; we will discuss this briefly by a very different approach. In addition to the asymptotic
normality for the number of Pareto records in the $d$-dimensional simplex, our main results are
summarized in the following table, where we list the asymptotics of the mean (first entry) and the
variance (second entry) in each case.



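Records \ Models      Hypercube $[0,1]^d$                                                       $d$-dimensional simplex

Dominating records    $H_n^{(d)}$, $H_n^{(d)} - H_n^{(2d)}$                                     (15), (16)
Chain records         $\frac{1}{d}\log n$, $\frac{1}{d^2}\log n$ [14]                           $\frac{1}{dH_d}\log n$, $\frac{H_d^{(2)}}{dH_d^3}\log n$
Pareto records        $\frac{1}{d!}(\log n)^d$, $(\frac{1}{d!} + \kappa_{d+1})(\log n)^d$ [14]   $m_d n^{(d-1)/d}$, $v_d n^{(d-1)/d}$
Maxima                = Pareto records in $[0,1]^{d-1}$ [14]                                    $\tilde m_d n^{(d-1)/d}$, $\tilde v_d n^{(d-1)/d}$

Here $H_m^{(a)} := \sum_{1 \le i \le m} i^{-a}$, $\kappa_d$ is a constant (see [1]),
$m_d := \frac{d}{d-1}\Gamma(\frac{1}{d})$, $v_d$ is defined in (3), $\tilde m_d := \Gamma(\frac{1}{d})$,
$\tilde v_d$ is given in (4), and both (15) and (16) are bounded in $n$ and in $d$; see Figure 3.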

From this table, we see clearly that the three types of records behave very differently,
although they coincide when $d = 1$. Roughly, the number of dominating records is bounded
(indeed less than two on average) in both models, while the chain records have a typical
logarithmic quantity; and it is the Pareto records that reflect better the variations of the
underlying models.

Figure 3: The mean and the variance of the number of dominating records in low dimensional
random samples. In each model, the expected number approaches 1 very fast as $d$ increases
with the corresponding variance tending to zero.

Organization of the paper. We derive asymptotic approximations to the mean and the variance
for the number of Pareto records in the next section. Since the expression for the leading
coefficient of the asymptotic variance is very messy, we then address in Section 3 the numerical
aspect of this constant. The tools we use turn out to be useful also for several other constants
of a similar nature, which we briefly discuss. We then discuss the chain records and the
dominating records.

2 Asymptotics of the number of Pareto records 

Let

$$S_d := \{x : x_i \ge 0 \text{ and } 0 \le \|x\| \le 1\}$$

denote the $d$-dimensional simplex, where $\|x\| := x_1 + \cdots + x_d$. Assume that
$p_1, \dots, p_n$ are iud in $S_d$. Let $X_n$ denote the number of Pareto records of
$\{p_1, \dots, p_n\}$. We derive in this section asymptotic approximations to the mean and the
variance and a Berry-Esseen bound for $X_n$. The same method of proof also applies to the number
of maxima, denoted by $M_n$, which we will briefly discuss.

Let $q_1, \dots, q_n$ be iud in $S_d \times [0,1]$. As discussed in the Introduction, the
distribution of $X_n$ is the same as the distribution of the number of maxima of
$\{q_1, \dots, q_n\}$.
For notational convenience, we write $a_n \simeq b_n$ if $a_n = b_n + O(n^{-1/d})$.

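Theorem 1 The mean and the variance of the number of Pareto records $X_n$ in random samples
from the $d$-dimensional simplex satisfy

$$\mathbb{E}[X_n] \simeq \sum_{0 \le j \le d-2} \binom{d-1}{j} (-1)^j \frac{d}{d-1-j}
  \left( \Gamma\!\left(\frac{j+1}{d}\right) n^{(d-1-j)/d} - 1 \right)
  + (-1)^{d-1} (\log n + \gamma), \qquad (2)$$

$$\mathbb{V}[X_n] = (v_d + o(1))\, n^{1-1/d},$$

where

$$v_d := \frac{d}{d-1}\Gamma\!\left(\frac{1}{d}\right)
  + 2d^2 \sum_{1 \le \ell < d} \binom{d}{\ell} \frac{(d-1)!}{(\ell-1)!(d-1-\ell)!}
  \int_0^1\!\!\int_v^1\!\!\int_0^\infty\!\!\int_0^\infty\!\!\int_0^\infty
  y^{d-\ell-1} w^{\ell-1} e^{-u(x+y)^d - v(x+w)^d} \left(e^{vx^d} - 1\right) dw\, dy\, dx\, du\, dv$$

$$\qquad + 2d^2 \int_0^1\!\!\int_v^1\!\!\int_0^\infty\!\!\int_0^\infty
  w^{d-1} e^{-ux^d - v(x+w)^d} \left(e^{vx^d} - 1\right) dw\, dx\, du\, dv$$

$$\qquad - 2d^2 \int_0^1\!\!\int_v^1\!\!\int_0^\infty\!\!\int_0^\infty
  y^{d-1} e^{-u(x+y)^d - vx^d}\, dy\, dx\, du\, dv. \qquad (3)$$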
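Proof. The method of proof is similar to that given in [1], but the technicalities are more
involved. We start with the expected value of $X_n$. Let $G_i = 1_{\{q_i \text{ is a maximum}\}}$. Then

$$\mathbb{E}[X_n] = n\,\mathbb{E}[G_1]
  = d!\, n \int_0^1\!\!\int_{S_d} \left(1 - z(1 - \|x\|)^d\right)^{n-1} dx\, dz$$

$$\simeq d!\, n \int_0^1\!\!\int_{S_d} e^{-nz(1-\|x\|)^d}\, dx\, dz
  = dn \int_0^1\!\!\int_0^1 e^{-nz(1-y)^d} y^{d-1}\, dy\, dz \qquad (y \mapsto \|x\|)$$

$$= dn \sum_{0 \le j < d} \binom{d-1}{j} (-1)^j \int_0^1\!\!\int_0^1 e^{-nzy^d} y^{j}\, dy\, dz$$

$$\simeq \sum_{0 \le j \le d-2} \binom{d-1}{j} (-1)^j \frac{d}{d-1-j}
  \left( \Gamma\!\left(\frac{j+1}{d}\right) n^{(d-1-j)/d} - 1 \right)
  + (-1)^{d-1} (\log n + \gamma).$$

This proves (2).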

For the variance, we start from the second moment, which is given by

$$\mathbb{E}[X_n^2] = \mathbb{E}[X_n] + n(n-1)\,\mathbb{E}[G_1 G_2].$$

Let $A$ be the region in $\mathbb{R}^d \times [0,1]$ such that $q_1$ and $q_2$ are incomparable
(neither dominating the other). Write $q_1 = (x, u)$, $q_2 = (y, v)$, $\|x\|_* := (\|x\| \wedge 1)$ and

$$x \vee y := (x_1 \vee y_1, \dots, x_d \vee y_d).$$

Then by standard majorization techniques (see [1])

$$n(n-1)\,\mathbb{E}[G_1 G_2] = n(n-1)\, d!^2 \int_A \left(1 - u(1-\|x\|)^d - v(1-\|y\|)^d
  + (u \wedge v)(1 - \|x \vee y\|_*)^d\right)^{n-2} dx\, dy\, du\, dv$$

$$\simeq n^2 d!^2 \int_A e^{-n[u(1-\|x\|)^d + v(1-\|y\|)^d]}\, dx\, dy\, du\, dv$$

$$\quad + n^2 d!^2 \int_A e^{-n[u(1-\|x\|)^d + v(1-\|y\|)^d]}
  \left(e^{n(u \wedge v)(1 - \|x \vee y\|_*)^d} - 1\right) dx\, dy\, du\, dv$$



2 

~ n 



d-, 



i<e<d 

where 

-1 rl 



Jn,o = 2n^d\' f [ [ e-"["(i-IW')'+^(i-W)'ldxdydwdt;, 

Jo Jv J 



1 /•! 



Jr,.j = n^d\'^ 




JO 



-n[«(l-||x||)'*+i,(l-|| yll)''] J^gn(«Ai>)(l-||xVy||J'* _ dxdydwdw, 



Xi<yi,£<i<d 

x,ye5d 



-1 p1 



y^x 

x,ye5d 

for 1 < £ < rf - 1. 

Consider first J„^£, 1 < £ < d. We proceed by four changes of variables to simplify the 
integral starting from 

Xi^^i, - r]i), forl<z< 

Xi h-^ - r/i), Ui H-^ ii, for i <i < d, 



which leads to 




Jo JSdJ [0,1]'^ 



{ndiy 



where J2 ■= Eti U ■= ULi E' ■= Ef=i and ^" Xi := ^ 
Next, by the change of variables 

d 



i=i+i Xi 



we have 

-1 /.I 



J^^ = cl\^ ! I ! ! ^-[n{i:<i^+Y.''m{i~di,n-^/'^)Y+v{j:^,+Y.'m{i-di.n-y'^)Y] 

Jo Jo ^Sd(n)J[0,nl/d/rf]d 

X (^e^-^-)iJ:i^r _ JJ (1 _ c/^^n-^/'^) d^drjdudv, 
where Sd{n) = : < n^^dand ||^|| > 0}. 
We then perform the change of variables 

and obtain 

JO Jo JSd{n) J[0,nVd/d]d 

X (^e("^")(^«')' - l) d^dridudv. 
Finally, we "linearize" the integrals by the change of variables 

and get 

J /-/I 

Jn,e^ ,, .^, ri'-'/' / / / / /-^-v-v"(^+^)'-"(^+-)° 



{d - e - i)\{e - 1)\ 

X (^e^^Mo;' - 1^ dwdydxdudv, 
since the change of variables produces the factors 



^0 ^0 ^0 ^0 



and 



{d - {d - i - 1)\ {i-iy.' 

Now by symmetry, we have 



X ^-u{x+yr-v{x+wr f^^vx'^ _ dwdydxdudv. 



Proceeding in a similar manner for J„ d, we deduce that 

-l rl 



Jq Jv is^rnUfO.nVd/wld V / 



'0 Jv J Sd(n)J[0,n'^/'^/d]<i 

By the change of variables .x i-> ^ ^j, w i-^ ^ ?7i, we have 

<\2 /"l /"I /"oo /"oo 



J„ , ~ ^ n^-y^ j I r r ^d-l^-ux^-vix+v,f ( ^iuAv)x^ _ A 

{{d-iy.y ./o Jv Jo Jo ^ ^ 

= 2d^n'-^/'' 1 1 r r ^d-i^-ux^-vix+wf Lvx'^ _ {\ 

Jo Jv Jo Jo ^ ^ 

Similarly, for J„ q, we get 

J„,o = 2d!2 /77 / e-N^«^+^''')'+^(S^')1d^d77di.dT;. 

Jo Jv J SJn)J\0.n^/'i/d]<' 



10 Jv J Sd{n)J[0,n^/'^/d]'' 

The change of variables x i-^ ^^i, y h-^ J^iji then yields 

J„,o^2dV-^/W / / / /"^e-"(^+^) -''^'dydxdiid^;. 

Jo Jv Jo Jo 

This completes the proof of the theorem. I 
Remark. By the same arguments, we derive the following asymptotic estimates for the number
of maxima in $S_d$:

$$\mathbb{E}[M_n] \simeq \sum_{0 \le j < d} \binom{d-1}{j} (-1)^j\,
  \Gamma\!\left(\frac{j+1}{d}\right) n^{(d-1-j)/d},$$

$$\mathbb{V}[M_n] = (\tilde v_d + o(1))\, n^{1-1/d},$$

where



(Ml 



^ . u) (d - A; - 1)!(A; - 1)! 



l<fe<d 

OO POO POO 



POO POO PO^ . . 

xj J J (e"'-l) dwdydx (4) 

POO POO 

Jo Jo 

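Theorem 2 The number of Pareto records in iud samples from the $d$-dimensional simplex is
asymptotically normal with a rate given by

$$\sup_x \left| \mathbb{P}\!\left( \frac{X_n - \mathbb{E}[X_n]}{\sqrt{\mathbb{V}[X_n]}} < x \right) - \Phi(x) \right|
  = O\!\left( n^{-(d-1)/(4d)} (\log n)^2 + n^{-1/d} (\log n)^{1/d} \right), \qquad (5)$$

where $\Phi(x)$ denotes the standard normal distribution function.

Proof. Define the region

$$D_n := \left\{ (x, z) : x \in S_d,\ z \in [0,1] \text{ and } z(1 - \|x\|)^d \le \frac{2\log n}{n} \right\}.$$

Let $\bar X_n$ denote the number of maxima in $D_n$ and $\tilde X_n$ the number of maxima of a
Poisson process on $D_n$ with intensity $d!\,n$. Then

$$\left| \mathbb{P}\!\left( \frac{X_n - \mathbb{E}[X_n]}{\sqrt{\mathbb{V}[X_n]}} < x \right) - \Phi(x) \right|
  \le \left| \mathbb{P}\!\left( \frac{X_n - \mathbb{E}[X_n]}{\sqrt{\mathbb{V}[X_n]}} < x \right)
     - \mathbb{P}\!\left( \frac{\bar X_n - \mathbb{E}[X_n]}{\sqrt{\mathbb{V}[X_n]}} < x \right) \right|$$

$$\quad + \left| \mathbb{P}\!\left( \frac{\bar X_n - \mathbb{E}[X_n]}{\sqrt{\mathbb{V}[X_n]}} < x \right)
     - \mathbb{P}\!\left( \frac{\tilde X_n - \mathbb{E}[X_n]}{\sqrt{\mathbb{V}[X_n]}} < x \right) \right|$$

$$\quad + \left| \mathbb{P}\!\left( \frac{\tilde X_n - \mathbb{E}[\tilde X_n]}{\sqrt{\mathbb{V}[\tilde X_n]}} < y \right) - \Phi(y) \right|
  + \left| \Phi(y) - \Phi(x) \right|, \qquad (6)$$

for $x \in \mathbb{R}$, where

$$y = x \sqrt{\frac{\mathbb{V}[X_n]}{\mathbb{V}[\tilde X_n]}}
  + \frac{\mathbb{E}[X_n] - \mathbb{E}[\tilde X_n]}{\sqrt{\mathbb{V}[\tilde X_n]}}.$$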
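We prove that the four terms on the right-hand side of (6) all satisfy the $O$-bound in (5). For
the first term, we consider the probability

$$\mathbb{P}(X_n \ne \bar X_n) \le n\, \mathbb{P}(q_1 \notin D_n \text{ and } q_1 \text{ is a maximum})
  = n\, d! \int_{S_d \times [0,1] - D_n} \left(1 - z(1 - \|x\|)^d\right)^{n-1} dx\, dz$$

$$\le n\, d! \int_{S_d \times [0,1] - D_n} \left(1 - \frac{2\log n}{n}\right)^{n-1} dx\, dz = O(n^{-1}).$$

For the second term on the right-hand side of (6), we use a Poisson process approximation

$$\sup_t \left| \mathbb{P}(\bar X_n < t) - \mathbb{P}(\tilde X_n < t) \right| \le O(|D_n|)
  = O\!\left(n^{-1/d} (\log n)^{1/d}\right).$$

To bound the third term, we use Stein's method similar to the proof for the case of hypercube
given in [1] and deduce that

$$\sup_y \left| \mathbb{P}\!\left( \frac{\tilde X_n - \mathbb{E}[\tilde X_n]}{\sqrt{\mathbb{V}[\tilde X_n]}} < y \right) - \Phi(y) \right|
  = O\!\left( \frac{(\mathbb{E}[\tilde X_n])^{1/2} Q_n}{(\mathbb{V}[\tilde X_n])^{3/4}} \right)
  = O\!\left( n^{-(d-1)/(4d)} (\log n)^2 \right),$$

where $Q_n$ is the error term resulting from the dependence between the cells decomposed and

$$Q_n = O\!\left((\log n)^2\right).$$

Finally, the last term in (6) is bounded above as follows.

$$|\Phi(y) - \Phi(x)| = O\!\left( \frac{\left|\sqrt{\mathbb{V}[\tilde X_n]} - \sqrt{\mathbb{V}[X_n]}\right|}{\sqrt{\mathbb{V}[\tilde X_n]}}
  + \frac{\left|\mathbb{E}[X_n] - \mathbb{E}[\tilde X_n]\right|}{\sqrt{\mathbb{V}[\tilde X_n]}} \right)
  = O\!\left( n^{-(d+1)/(2d)} \right).$$

This proves (5).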



Remark. By defining

$$D_n := \left\{ x : x \in S_d \text{ and } (1 - \|x\|)^d \le \frac{2\log n}{n} \right\}$$



instead and by applying the same arguments, we deduce the Berry-Esseen bound for the number 
of maxima in iud samples from Sd 



sup 






3 Numerical evaluations of the leading constants 



The leading constants $v_d$ (see (3)) and $\tilde v_d$ (see (4)) appearing in the asymptotic
approximations to the variance of $X_n$ and to that of $M_n$ are not easily computed via existing
software. We discuss in this section more effective means of computing their numerical values to a
high degree of precision. Our approach is to first apply Mellin transforms (see [9]) and derive
series representations for the integrals by standard residue calculations and then convert the
series in terms of the generalized hypergeometric functions

$${}_pF_q(\alpha_1, \dots, \alpha_p; \beta_1, \dots, \beta_q; z)
  := \frac{\Gamma(\beta_1) \cdots \Gamma(\beta_q)}{\Gamma(\alpha_1) \cdots \Gamma(\alpha_p)}
  \sum_{j \ge 0} \frac{\Gamma(j + \alpha_1) \cdots \Gamma(j + \alpha_p)}{\Gamma(j + \beta_1) \cdots \Gamma(j + \beta_q)} \cdot \frac{z^j}{j!}.$$

The resulting linear combinations of hypergeometric functions can then be computed easily to a
high degree of precision by any existing symbolic software, even on a modest laptop.
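To make the series definition concrete, here is a minimal sketch that sums the generalized hypergeometric series term by term in double precision. It is only an illustration of the definition, not the symbolic computation used for the constants in this section; the function name `hyp_pfq` and its parameters are hypothetical, and the sketch assumes the series converges for the given argument and that no lower parameter is a non-positive integer.

```python
import math


def hyp_pfq(upper, lower, z, tol=1e-16, max_terms=10_000):
    """Sum the pFq series using the ratio of consecutive terms.

    term_{j+1} / term_j = prod(a + j for a in upper) / prod(b + j for b in lower) * z / (j + 1),
    which is equivalent to the Gamma-function form of the definition.
    """
    term = 1.0
    total = 1.0
    for j in range(max_terms):
        num = 1.0
        for a in upper:
            num *= a + j
        den = 1.0
        for b in lower:
            den *= b + j
        term *= num / den * z / (j + 1)
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total


# Two classical special cases as sanity checks:
# 0F0(;;z) = exp(z) and 2F1(1,1;2;z) = -log(1 - z) / z.
exp_check = hyp_pfq([], [], 1.0)
log_check = hyp_pfq([1.0, 1.0], [2.0], 0.5)
print(exp_check, log_check)
```

For the 25-digit values reported below, arbitrary-precision arithmetic would be needed instead of floating point.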

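The leading constant $v_d$ of the asymptotic variance of the $d$-dimensional Pareto records.

We consider the following integrals

$$C_d := \sum_{1 \le m < d} \frac{d!}{m!(d-m)!} \cdot \frac{(d-1)!}{(m-1)!(d-1-m)!}
  \int_0^1\!\!\int_v^1\!\!\int_0^\infty\!\!\int_0^\infty\!\!\int_0^\infty
  y^{d-1-m} w^{m-1} e^{-u(x+y)^d - v(x+w)^d} \left(e^{vx^d} - 1\right) dw\, dy\, dx\, du\, dv$$

$$\quad + \int_0^1\!\!\int_v^1\!\!\int_0^\infty\!\!\int_0^\infty
  w^{d-1} e^{-ux^d - v(x+w)^d} \left(e^{vx^d} - 1\right) dw\, dx\, du\, dv
  - \int_0^1\!\!\int_v^1\!\!\int_0^\infty\!\!\int_0^\infty
  y^{d-1} e^{-u(x+y)^d - vx^d}\, dy\, dx\, du\, dv$$

$$=: (d-1) \sum_{1 \le m < d} \binom{d}{m} \binom{d-2}{m-1} I_{d,m} + I_{d,d} - I_{d,0}.$$

Then $C_d$ is related to $v_d$ by $v_d = \frac{d}{d-1}\Gamma(\frac{1}{d}) + 2d^2 C_d$. We start from
the simplest one, $I_{d,0}$, and use the integral representation for the exponential function

$$e^{-t} = \frac{1}{2\pi i} \int_{(c)} \Gamma(s)\, t^{-s}\, ds,$$

where $c > 0$, $\Re(t) > 0$ and the integration path $\int_{(c)}$ is the vertical line from
$c - i\infty$ to $c + i\infty$. Substituting this representation into $I_{d,0}$, we obtain

$$I_{d,0} = \frac{1}{2\pi i} \int_{(c)} \Gamma(s) \int_0^1\!\!\int_v^1\!\!\int_0^\infty\!\!\int_0^\infty
  y^{d-1} u^{-s} (x+y)^{-ds} e^{-vx^d}\, dy\, dx\, du\, dv\, ds.$$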
Making the change of variables $y \mapsto xy$ yields

$$I_{d,0} = \frac{1}{2\pi i} \int_{(c)} \Gamma(s) \int_0^1\!\!\int_v^1\!\!\int_0^\infty\!\!\int_0^\infty
  u^{-s} x^{d(1-s)} (1+y)^{-ds} y^{d-1} e^{-vx^d}\, dy\, dx\, du\, dv\, ds$$

$$= \frac{d\,\Gamma(d-1)}{2\pi i} \int_{(c)}
  \frac{\Gamma(s)\, \Gamma(ds - d)\, \Gamma(\frac{1}{d} + 1 - s)}{\Gamma(ds)\,(ds - 1)}\, ds,$$

where $1 < c < 1 + \frac{1}{d}$. Moving the integration path to the right, one encounters the
simple poles at $s = 1 + \frac{1}{d} + j$ for $j = 0, 1, \dots$. Summing over all residues of these
simple poles and proving that the remainder integral tends to zero, we get

where the terms converge at the rate This can be expressed easily in terms of the 

generalized hypergeometric functions. 

An alternative integral representation can be derived for Id^o as follows. 

'''' m ^ r(i + 2) ^ i 



which can also be derived directly from the original multiple integral representation and suc- 
cessive changes of variables (first u, then x, then v, and finally y). In particular, for d — 2, 

l2,o = (\/2 - 1 + log2 - log(\/2 - 1)) . 

Now we turn to Id,d- 

p1 p1 poo /'OO 

Id,d = J J J J w'^"'e-"^'-^(^+"')' (^e^^' - ij dwdxdwd^;. 
By the same arguments used above, we have 

_T{d) f r(s)r(ds - d)r(i + 1 - ^) , 

^'''~2dniJ^,^ r{ds) ^'^'''^'^ 

where c > 1 and 

Id,d I' I' ((« - vy-'"^ - u'-'"^) du dv. 
To evaluate assume first that ^ < 3fJ(s) < 1, so that 

rl rl—v pi ru 

I'^^= I I u'~^--ddu- I u'-^--i I v-'dvdu 







d /r(i-s)r(s-i) 



d-i\ r(i-i) i-s 
Now the right-hand side is well-defined for | < 3(?(s) < 2. Substituting this into Id,d, we obtain 

_ r(ci-i) f T{s)T{ds - d)T{i + 1 - / r(i-.)r(.-i) _ ^\ 
27ri 4 r(d.) 1, r(i-i) 1-.;^"' 

where 1 < c < 1 + ^. For computational purpose, we use the functional equation for Gamma 
function 

r{i-s)r{s) = -^, 

smiTS 

so that 

_ T{d-l) f nr{ds-d) ( TT V{s-l) \ 

2m 4 V{ds) sin(7r(s - i)) Vr(l - \) smins) ^ r(ds)r(s J ' 

In this case, we have simples poles at s = j + 1 /c? for both integrands and s — j for the first 

integrand to the right of = 1 for j = 2,3, Thus summing over all the residues and 

proving that the remainder integral goes to zero, we obtain 

V - r(d - m,) 1^ - r(d - 1) 1^ rwrw + i) ' 

A similar argument as that used for 1^ o gives the alternative integral representation 
In particular, for d = 2, 

72,2 = V^(2- V2-21og2 + log(V2 + l)) ■ 
Now we consider Id,m for 1 < m < d. 

/*1 pi f'OO /*OC f^CO 

h,m-^ / / / / /-'-"'w"*-'e-"(^+^)'-"(^+^)' (e"^' - 1) dwdydxdud^;, 

JO Jv Jo Jo Jo ^ ^ 

which by the same arguments leads to 



X 



dr(d-m) /■ r(5)r(d5 -d + m)r(i + 1 - 5) ^^^^^^^^ 



2{d-l)m J^^) r{ds){ds-l) 



where 1 < c < 1 + | and 



Wm{s) := y w'"-^ (^((1 + w)'^' - 1)'"^"^ - (1 + w)'^'-'^-^j dw 



^okm"^ ^ r ' Vr(l-^)r(. + ^)sin(7r(. + S) 1 
for I < 3?(s) < 2 — (m — Note that each term has no pole at s = 1 — |. Thus 

_ r{d-m) ^ 



d _ 

Q<i<m 



where 

h 



-I 



V{s)V{ds -d+m)T{l + \-s) 
27ri J(c) V[ds){ds — 1) 



We then deduce that the integral equals the sum of the residues at s = j + ^ and s — j + 1 — ^ 



r(^) ^ r(j + i + i)r(rfj + m + i) 



d ^(j + i)r(rfj + rf+i)r(j + i + ^) 
ipr(j + i + i)r(rfj + m + i) 



.^^i!r(dj + d-^)(di + d-^-i) 



7-[l] , 7-[2] , 7-[3] 



It follows that 

Cd — IdA + Idfl 



l<m<(i V / ^ / 
_ U\ {d-2)\ ^ (^-^\(_^^m-l-^(J[l] ji2] jlS] \ 

l<m<d ^ / ^ ^' 0<e<m ^ ^ 

=:Cf+cf+cf. 
For further simplification of these sums, we begin with cf \ Note first that 



d,m,i 



o<e<m 

_i {-iyr{j + i + l)r{dj + m + i) ^ /m-iV i 

^ ^ ^ ij + midj+d+i) ■ 
Thus 



d,Tn,i 

l<m<d ^""^ 0<i<m 



l<m<d ^ ^ ^ ' 0<£<m ^ ' 

l<m<d ^ ^ i>l 

Accordingly, C^^' = for odd values of d. 

For the other two sums containing /^^j^j^ and l^y^p we use the identity 

^ (iV + m)!(-l)™ _ (-l)'^A^! ((N^\^e\ [N + d 



^^^^^m\{d - m)\{m - I - ^)\ {d - I - E)\ \\ d J \ d 
Then 

[!]_ ^ ( d\ {d-2)\ ^ [1] 
r(rfj + m + 



X 



E 



m!((i — m)!(m — 1 — £)! 



Note that 

o{r') {o<i<d-2), 



for large j, so that the series is absolutely convergent. 
Similarly, 



W3]_ V- f d\ id-2)\ ^ f^-A(,-.m-i-erm 

l<m<d ^ / ^ ^ 0<£<m ^ ^ 



J>1 



Since $v_d = \frac{d}{d-1}\Gamma(\frac{1}{d}) + 2d^2 C_d$, we obtain, by converting the series
representations into hypergeometric functions, the following approximate numerical values of $v_d$.

$v_2 \approx 2.86126\ 35493\ 11178\ 82531\ 14379$,
$v_3 \approx 3.22524\ 36444\ 05576\ 89660\ 59392$,
$v_4 \approx 3.97797\ 27442\ 19455\ 29292\ 64760$,
$v_5 \approx 4.84527\ 39171\ 62611\ 42226\ 50057$,
$v_6 \approx 5.76349\ 95321\ 96568\ 64812\ 77416$,
$v_7 \approx 6.70865\ 12250\ 86590\ 36364\ 34742$,
$v_8 \approx 7.66955\ 04435\ 24665\ 04704\ 24808$,
$v_9 \approx 8.64032\ 79742\ 08287\ 24931\ 00067$,
$v_{10} \approx 9.61764\ 75521\ 13755\ 73944\ 20940$,
$v_{11} \approx 10.59949\ 78766\ 56951\ 63098\ 76869$,
$v_{12} \approx 11.58460\ 78314\ 60409\ 77794\ 37163$.

In particular, $v_2$ has a closed-form expression

$$v_2 = \tfrac{2}{3}\sqrt{\pi}\left(2\pi^2 - 9 - 12\log 2\right).$$

The leading constant $\tilde v_d$ of the asymptotic variance of the $d$-dimensional maxima. Let



00 POO 

d 



Jo Jo 



and 



Then (see (4)) 

Vd = \ ] + > , 1,1 Jd,k - Jd,o- 



^ ^ l<k<d ^ ^ 



Consider first o- By expanding (1 + x'^) ^ d, interchanging and evaluating the integrals, we 
obtain 



Jds) — 2r 



i\ /"^ d -X 



f r dx 

(l + x'^)i+^ 



|^r(j + i)r(dj + rf + i)^ ^' 
the general terms converging at the rate 0{j d). The convergence rate can be accelerated as 
follows. 



j,,o = 2r (^1 + 2' x-^-\i - x-^Y-\i + x)-^--^ dx 
= 2r ("i + ^) V 2-'-^-3 !\l - xYx^-\i - x^f-' 



dx 



0<£<d 



the convergence rate being now exponential. In terms of the generalized hypergeometric func- 
tions, we have 



J. 



d,0 



' o<e<d ^ / ' ^ 



The integrals Jd,k can be simplified as follows. 

= d'{d- 1) - ^) J^^ie^" - 1) J^^ e-y' 

poo 

X / {y - xY'''^~^{z - xy^e'^'^ dzdydx 

X / {e''' - l){y - xY~'^~'^ {z - xY dx dz dy 
Jo 

= 2{d - i)r (^^ ~ J\l - x)' J\i - xzY-^-'^z^^^ 

\{l + z'^-x'^z'^f+-d {l + zdf+-d) 
By the same proof used for o> we have 



X 



[\l - Xz)'^-^-''z''+\l + dz dx 

Jo 
Similarly, 



2r(i) (.-!)! E 



X 



E 



j!(rf-2-A;-j)! 



Thus we obtain the following numerical values for the limiting constant $\tilde v_d$ of
$\mathbb{V}[M_n] / n^{(d-1)/d}$:

$\tilde v_2 \approx 0.68468\ 89279\ 50036\ 17418\ 09957$,
$\tilde v_3 \approx 1.48217\ 31873\ 40583\ 68601\ 11369$,
$\tilde v_4 \approx 2.35824\ 37612\ 02486\ 93742\ 28054$,
$\tilde v_5 \approx 3.27773\ 90059\ 79491\ 26684\ 80858$,
$\tilde v_6 \approx 4.22231\ 09450\ 77067\ 79998\ 34338$,
$\tilde v_7 \approx 5.18220\ 76686\ 16078\ 48517\ 29967$,
$\tilde v_8 \approx 6.15196\ 29023\ 77474\ 45508\ 28039$,
$\tilde v_9 \approx 7.12835\ 13658\ 43360\ 52793\ 29089$,
$\tilde v_{10} \approx 8.10938\ 23221\ 15849\ 82527\ 77117$,
$\tilde v_{11} \approx 9.09377\ 74697\ 86680\ 89694\ 70616$,
$\tilde v_{12} \approx 10.0806\ 86465\ 19733\ 08113\ 16376$.

In particular, $\tilde v_2 = \sqrt{\pi}\,(2\log 2 - 1)$; see [2].
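As a quick consistency check of the two closed forms for $d = 2$, the following sketch evaluates $\frac{2}{3}\sqrt{\pi}(2\pi^2 - 9 - 12\log 2)$ and $\sqrt{\pi}(2\log 2 - 1)$ in double precision and compares them with the leading digits of the tabulated values. The prefactor $\frac{2}{3}$ in the first expression is an assumption made here, chosen because it reproduces the tabulated $v_2$; the variable names are hypothetical.

```python
import math

# Closed forms for d = 2 (the factor 2/3 is an assumption; see the lead-in).
v2_pareto = (2.0 / 3.0) * math.sqrt(math.pi) * (2 * math.pi ** 2 - 9 - 12 * math.log(2))
v2_maxima = math.sqrt(math.pi) * (2 * math.log(2) - 1)

# Leading digits of the tabulated constants.
v2_pareto_table = 2.8612635493
v2_maxima_table = 0.6846889279

print(v2_pareto, v2_maxima)
```

Double precision only confirms about ten digits; checking all 25 tabulated digits would require arbitrary-precision arithmetic.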

Yet another constant in [6]. A similar but simpler integral to (4) appeared in [6], which is of
the form

$$K_d := \int_0^\infty\!\!\int_0^\infty\!\!\int_0^\infty (u+w)^{d-2}\,
  e^{-(u+x)^d + u^d - (u+w)^d}\, dx\, du\, dw$$

(this $K_d$ is indeed their $K_{d-1}$). By the Mellin inversion formula for $e^{-t}$, we obtain

$$K_d = \frac{1}{2\pi i} \int_{(c)} \Gamma(s) \int_0^\infty\!\!\int_0^\infty\!\!\int_0^\infty
  (u+w)^{d-2} (u+x)^{-ds}\, e^{u^d - (u+w)^d}\, dx\, du\, dw\, ds$$



1 



2dTii 



X 



T{s)T{l + h-s 



OO POO 




JO 



{u + wY-\l + uy^' {{1 + w)"^ - ly ' ''dudwds. 
Expanding the factor $(u+w)^{d-2}$, we obtain $K_d = \sum_{0 \le m \le d-2} \binom{d-2}{m} K_{d,m}$, where

nl- [ r(s)r(l + i-s)5(m+l,ds-m-l)t/^(s)ds. (7) 



Here 



POO 

Um{s) := / ((1 + ^i;)'^ - l)''"^"^ dw 

Jo 

.1 -I (H-i) 



d— 2— m 

dt 



1 ^ /d-2-m\, ,,d_2-m- 



0<e<d~2~m 



Thus we obtain 



0<m<d-2 ^ ^ 0<e<d-2-m ^ ^ 

'r(j + i-|)r(o(7- + d-£-m-i) r(j + i + i)r(rfj + d-m) 




i!r(dj + d-£) r(j + i + ^)r(rfj + rf+i 

This readily gives, by converting the above series into hypergeometric functions, the numerical
values of the first few $K_d$:

$K_2 \approx 0.30714\ 28473\ 56944\ 02518\ 48954$,
$K_3 \approx 0.21288\ 24684\ 73220\ 99693\ 80676$,
$K_4 \approx 0.19494\ 67028\ 23033\ 18190\ 40460$,
$K_5 \approx 0.20723\ 21512\ 99671\ 45854\ 93769$,
$K_6 \approx 0.24331\ 17024\ 51836\ 72554\ 88428$,
$K_7 \approx 0.30744\ 56566\ 07893\ 22242\ 37300$,
$K_8 \approx 0.41127\ 01058\ 90385\ 83873\ 59349$,
$K_9 \approx 0.57571\ 68456\ 67243\ 64328\ 08087$,
$K_{10} \approx 0.83615\ 82236\ 77116\ 00233\ 16115$,
$K_{11} \approx 1.25179\ 63251\ 14070\ 86480\ 31485$,
$K_{12} \approx 1.92201\ 04035\ 18847\ 36012\ 85304$.

These are consistent with those given in Chiu and Quine (1997) [6]. In particular,
$K_2 = \frac{1}{4}\sqrt{\pi}\log 2$. Further simplification of this formula can be obtained as
above, but the resulting integral expression is not much simpler than

rfi) /•! / 1 1 x'^-^ 1 1 



^0 
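The value $K_2 = \frac{1}{4}\sqrt{\pi}\log 2$ can be checked by direct quadrature. The sketch below assumes the triple-integral form $K_d = \int\!\!\int\!\!\int (u+w)^{d-2} e^{-(u+x)^d + u^d - (u+w)^d}\,dx\,du\,dw$ over $[0,\infty)^3$; for $d = 2$ the inner integrals over $x$ and $w$ both equal $g(u) = \int_u^\infty e^{-t^2}dt = \frac{\sqrt{\pi}}{2}\,\mathrm{erfc}(u)$, so that $K_2 = \int_0^\infty e^{u^2} g(u)^2\,du$. The function names and the truncation point 8 are choices made for this illustration.

```python
import math


def k2_integrand(u):
    """e^{u^2} * g(u)^2 with g(u) = (sqrt(pi)/2) * erfc(u)."""
    g = 0.5 * math.sqrt(math.pi) * math.erfc(u)
    return math.exp(u * u) * g * g


def simpson(f, a, b, n):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


# The integrand decays like exp(-u^2) / (4 u^2), so truncating at u = 8 is harmless.
k2_numeric = simpson(k2_integrand, 0.0, 8.0, 8000)
k2_closed = 0.25 * math.sqrt(math.pi) * math.log(2)
print(k2_numeric, k2_closed)
```

The same reduction does not apply for $d \ge 3$, where the series approach of this section is the practical route.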
4 Asymptotics of the number of chain records

We consider in this section the number of chain records of random samples from the
$d$-dimensional simplex; the tools we use are different from [14] and apply also to chain records
for hypercube random samples, which will be briefly discussed. For other types of results, see [14].

4.1 Chain records of random samples from the $d$-dimensional simplex

Assume that $p_1, \dots, p_n$ are iud in the $d$-dimensional simplex $S_d$. Let $Y_n$ denote the
number of chain records of this sample. Then $Y_n$ satisfies the recurrence

$$Y_n \stackrel{d}{=} 1 + Y_{I_n} \qquad (n \ge 1), \qquad (8)$$

with $Y_0 := 0$, where

$$\mathbb{P}(I_n = k) = \pi_{n,k} = d \binom{n-1}{k} \int_0^1 t^{dk} (1 - t^d)^{n-1-k} (1-t)^{d-1}\, dt,$$

for $0 \le k < n$. An alternative expression for the probability distribution $\pi_{n,k}$ is



7r„ 



which is more useful from a computational point of view. 
Let 

{z + l)---{z + d)-d\=z \[{z- A^), 

l<i<d 

where the A^'s are all complex M), except when d is even (in that case, —d — 1 is the unique 
real zero among {Ai, . . . , \d-i})- Interestingly, an essentially the same equation also arises in 
the analysis of random increasing fc-trees; see [7]. 

Theorem 3 The number of chain records $Y_n$ for random samples from the $d$-dimensional simplex
is asymptotically normally distributed in the following sense

$$\sup_x \left| \mathbb{P}\!\left( \frac{Y_n - \mu_s \log n}{\sigma_s \sqrt{\log n}} < x \right) - \Phi(x) \right|
  = O\!\left((\log n)^{-1/2}\right), \qquad (9)$$

where $\mu_s := 1/(dH_d)$ and $\sigma_s := \sqrt{H_d^{(2)} / (dH_d^3)}$. The mean and the variance
are asymptotic to

$$\mathbb{E}[Y_n] = \frac{H_n}{dH_d} + c_1 + O(n^{-\varepsilon}), \qquad (10)$$

$$\mathbb{V}[Y_n] = \frac{H_d^{(2)}}{dH_d^3} H_n + c_2 + O(n^{-\varepsilon}), \qquad (11)$$

respectively, where

CiHP 2d\ ^ {dj + 1) ■ ■ ■ {dj + - H^,) 

Hj H,j^^ ^^dj + i)---idj + d)~d\y • 

The error terms in (10) and (11) can be further refined, but we content ourselves with the current 
forms for simplicity. 

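Expected number of chain records. We begin with the proof of (10). Consider the mean
$\mu_n := \mathbb{E}[Y_n]$. Then $\mu_0 = 0$ and, by (8),

$$\mu_n = 1 + \sum_{0 \le k < n} \pi_{n,k}\, \mu_k \qquad (n \ge 1). \qquad (12)$$

Let $\tilde f(z) := e^{-z} \sum_{n \ge 0} \mu_n z^n / n!$ denote the Poisson generating function of
$\mu_n$. Then, by (12),

$$\tilde f(z) + \tilde f'(z) = 1 + d \int_0^1 \tilde f(t^d z)(1-t)^{d-1}\, dt.$$

Let $\tilde f(z) = \sum_{n \ge 0} \tilde\mu_n z^n / n!$. Taking the coefficients of $z^n$ on both
sides gives the recurrence

$$\tilde\mu_n + \tilde\mu_{n+1} = \frac{d!}{(dn+1) \cdots (dn+d)}\, \tilde\mu_n \qquad (n \ge 1).$$

Solving this recurrence using $\tilde\mu_1 = 1$ yields

$$\tilde\mu_n = (-1)^{n-1} \prod_{1 \le j < n} \left(1 - \frac{d!}{(dj+1) \cdots (dj+d)}\right).$$

It follows that for $n \ge 1$

$$\mu_n = \sum_{1 \le k \le n} \binom{n}{k} (-1)^{k-1} \prod_{1 \le j < k}
  \left(1 - \frac{d!}{(dj+1) \cdots (dj+d)}\right). \qquad (13)$$

This is an identity with exponential cancellation terms; cf. [14]. In the special case when
$d = 2$, we have the identity

$$\mu_n = \frac{H_n + 2}{3}.$$

No such simple expression is available for $d \ge 3$ since there are complex-conjugate zeros; see
(14).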
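The recurrence for the mean and the alternating-sum solution can be compared in exact rational arithmetic. The sketch below is an illustration, not the paper's computation: it evaluates $\pi_{n,k} = d\binom{n-1}{k}\int_0^1 t^{dk}(1-t^d)^{n-1-k}(1-t)^{d-1}dt$ by expanding the two binomial powers, solves $\mu_n = 1 + \sum_{0\le k<n}\pi_{n,k}\mu_k$ directly, and compares the result with $\sum_{1\le k\le n}\binom{n}{k}(-1)^{k-1}\prod_{1\le j<k}\bigl(1 - \frac{d!}{(dj+1)\cdots(dj+d)}\bigr)$. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def pi_simplex(n, k, d):
    """P(I_n = k) for the simplex model, via termwise integration."""
    m = n - 1 - k
    integral = Fraction(0)
    for i in range(m + 1):          # expansion of (1 - t^d)^m
        for j in range(d):          # expansion of (1 - t)^(d-1)
            sign = -1 if (i + j) % 2 else 1
            integral += Fraction(sign * comb(m, i) * comb(d - 1, j),
                                 d * (k + i) + j + 1)
    return d * comb(n - 1, k) * integral


def mean_by_recurrence(n_max, d):
    """mu_0, ..., mu_{n_max} from mu_n = 1 + sum_k pi_{n,k} mu_k."""
    mu = [Fraction(0)]
    for n in range(1, n_max + 1):
        mu.append(1 + sum(pi_simplex(n, k, d) * mu[k] for k in range(n)))
    return mu


def mean_by_sum(n, d):
    """Alternating binomial sum with the finite product over 1 <= j < k."""
    total = Fraction(0)
    prod = Fraction(1)              # empty product for k = 1
    for k in range(1, n + 1):
        total += (-1) ** (k - 1) * comb(n, k) * prod
        denom = 1
        for r in range(1, d + 1):
            denom *= d * k + r
        prod *= 1 - Fraction(factorial(d), denom)
    return total


def harmonic(n):
    return sum(Fraction(1, j) for j in range(1, n + 1))


mu2 = mean_by_recurrence(6, 2)
mu3 = mean_by_recurrence(6, 3)
print(mu2[6], mean_by_sum(6, 3))
```

For $d = 2$ the values agree with $(H_n + 2)/3$, and for $d = 3$ the two methods give the same rational numbers.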
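Exact solution of the general recurrence. In general, consider the recurrence

$$a_n = b_n + \sum_{0 \le k < n} \pi_{n,k}\, a_k \qquad (n \ge 1),$$

with $a_0 = 0$. Then the same approach used above leads to the recurrence (where $\tilde a_n$ and
$\tilde b_n$ denote the coefficients of $z^n/n!$ in the Poisson generating functions of $a_n$ and
$b_n$)

$$\tilde a_{n+1} = -\left(1 - \frac{d!}{(dn+1) \cdots (dn+d)}\right) \tilde a_n + \tilde b_n + \tilde b_{n+1},$$

which by iteration gives

$$\tilde a_n = \sum_{1 \le k \le n} (-1)^{n-k} \left(\tilde b_k + \tilde b_{k-1}\right)
  \prod_{k \le j < n} \left(1 - \frac{d!}{(dj+1) \cdots (dj+d)}\right),$$

by defining $b_0 = \tilde b_0 = 0$. Then we obtain the closed-form solution

$$a_n = \sum_{1 \le k \le n} \binom{n}{k}\, \tilde a_k.$$

A similar theory of "d-analogue" to that presented in [10] can be developed (by replacing
there by $d!/((dj+1) \cdots (dj+d))$).

However, this type of calculation becomes more involved for higher moments.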

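Asymptotics of $\mu_n$. We now look at the asymptotics of $\mu_n$. To that purpose, we need a
better expression for the finite product in the sum-expression (13).

In terms of the zeros $\lambda_j$'s of the equation $(z+1) \cdots (z+d) - d! = 0$, we have

$$\prod_{1 \le j < n} \left(1 - \frac{d!}{(dj+1) \cdots (dj+d)}\right)
  = \frac{\prod_{1 \le j < n} \left(dj \prod_{1 \le \ell < d} (dj - \lambda_\ell)\right)}
         {\prod_{1 \le j < n} \left((dj+1) \cdots (dj+d)\right)}
  = \frac{1}{n} \prod_{1 \le \ell < d}
    \frac{\Gamma(n - \frac{\lambda_\ell}{d})\, \Gamma(1 + \frac{\ell}{d})}
         {\Gamma(n + \frac{\ell}{d})\, \Gamma(1 - \frac{\lambda_\ell}{d})} \qquad (14)$$

$$=: \varphi(n).$$

The zeros $\lambda_j$'s are distributed very regularly as shown in Figure 4.

Now we apply the integral representation for the $n$-th finite difference (called Rice's
integrals; see [11]) and obtain

$$\mu_n = -\frac{1}{2\pi i} \int_{\frac12 - i\infty}^{\frac12 + i\infty}
  \frac{\Gamma(n+1)\, \Gamma(-s)}{\Gamma(n+1-s)}\, \varphi(s)\, ds.$$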

Note that 0(s) is well defined and has a simple pole at s = 0. The integrand then has a double 
pole at s = 0; standard calculations (moving the line of integration to the left and summing the 
residue of the pole encountered) then lead to 



dH, 
Figure 4: Distributions of the zeros of $(z+1) \cdots (z+d) - d! = 0$ for $d = 3, \dots, 50$. The
zeros approach, as $d$ increases, the limiting curve \z~^{z + = 1 (the blue innermost curve).

where the $O$-term can be made more explicit if needed. Here $H_n = \sum_{1 \le j \le n} 1/j$
denotes the harmonic numbers, and $\psi(z)$ denotes the derivative of $\log \Gamma(z)$. Note that
to get this expression, we used the identity

$$\frac{(z+1) \cdots (z+d) - d!}{z} = \sum_{1 \le j \le d} \frac{d!\, \Gamma(z + d - j + 1)}{(d - j + 1)!\, \Gamma(z+1)}.$$

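The probability generating function. Let $P_n(y) := \mathbb{E}[y^{Y_n}]$. Then $P_0(y) = 1$ and
for $n \ge 1$

$$P_n(y) = y \sum_{0 \le k < n} \pi_{n,k}\, P_k(y).$$

The same procedure used above leads to

$$P_n(y) = \sum_{0 \le k \le n} \binom{n}{k} (-1)^k \prod_{0 \le j < k}
  \left(1 - \frac{d!\, y}{(dj+1) \cdots (dj+d)}\right).$$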

Let now \y — 1| be close to zero and 

{z + l)---{z + d)-d\y= W {z-\,{y)). 



l<E<d 



Note that the A^'s are analytic functions of y. Let Xdiy) denote the zero with Ad(l) = 0. Then 
we have 

y-1 f'-'+'^Tin + l)Ti-s) 



Pniy) = 1 - ^r— / -f^T — — ^ y) ds, 






where 



T{s + i)T (i-^) i</<.r(. + ^)r(i-Mi) 



Note that for y ^ 1, 0(0, y) = 1 — y. When y ~ 1, the dominant zero is Xdiy), and we then 
deduce that 

Pn{y) = Q{y)n''^'^/'' + 0{\l-y\n-'), 

where 

p f ^d{y)-^i{y) \ r h I ^"i 

g(,) ■= - ^) n 

A.(,)r(i + Me)) ^11^ r [^) r (i - Me)) • 

By writing (2; + 1) ■ ■ ■ (2; + c?) — dly = as 

and by Lagrange's inversion formula, we obtain 

A.(l/) = V - - 1)^ + O (I, - . 

From this we then get Q{1) = 1 + 0{\y — 1|) and 

Ue )-Jf^ + ^^V V + 0{\v\ ), 

for small |r|. This is a typical situation of the quasi-power framework (see [12, 19]), and we 
deduce (10), (11) and the Berry-Esseen bound (9). The expression for C2 is obtained by an 
ad-hoc calculation based on computing the second moment (the expression obtained by the 
quasi-power framework being less explicit). 

When d = 2, a direct calculation leads to the identity 

Y\Y] = ^H +^ + ^_^_2^^2,-l 2j 

V[rnJ 97^"+ 97 ^ Q 97 Q 



for $n \ge 1$, which is also an asymptotic expansion. This is to be contrasted with
$\mathbb{E}[Y_n] = (H_n + 2)/3$.

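4.2 Chain records of random samples from hypercubes.

In this case, we have, denoting still by $Y_n$ the number of chain records in iud random samples
from $[0,1]^d$,

$$Y_n \stackrel{d}{=} 1 + Y_{I_n} \qquad (n \ge 1),$$

with $Y_0 = 0$ and

$$\mathbb{P}(I_n = k) = \binom{n-1}{k} \int_0^1 t^k (1-t)^{n-1-k}\, \frac{(-\log t)^{d-1}}{(d-1)!}\, dt
  \qquad (0 \le k < n).$$

Let $P_n(y) := \mathbb{E}[y^{Y_n}]$. Then the Poisson generating function
$\tilde P(z, y) := e^{-z} \sum_{n \ge 0} P_n(y) z^n / n!$ satisfies

$$\tilde P(z, y) + \frac{\partial}{\partial z} \tilde P(z, y)
  = y \int_0^1 \tilde P(tz, y)\, \frac{(-\log t)^{d-1}}{(d-1)!}\, dt,$$

with $\tilde P(0, y) = 1$. We then deduce that

$$P_n(y) = 1 + \sum_{1 \le k \le n} \binom{n}{k} (-1)^k \prod_{1 \le j \le k} \left(1 - \frac{y}{j^d}\right).$$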



Consequently, by Rice's integral representation [11], 



p ^ r^+'+'°° r(n + i)r(-g) r(g + 1 - yVd^-^i^i/d^ 

''^y^-2mj,_,_,^ T{n + l- s)T{s + l)-^ r(l - y W^-/^) 

If ly — 1| is close to zero, we deduce that 

Pniy) = Y(^yi/dy/d 11 r(l - yi/rfe2^W'i) ^ ' 

A very similar analysis as above then leads to a Berry-Esseen bound for F„ as follows. 

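Theorem 4 The number of chain records $Y_n$ for iud random samples from the hypercube $[0,1]^d$
satisfies

$$\sup_x \left| \mathbb{P}\!\left( \frac{Y_n - \mu_h \log n}{\sigma_h \sqrt{\log n}} < x \right) - \Phi(x) \right|
  = O\!\left((\log n)^{-1/2}\right),$$

where $\mu_h = \sigma_h := 1/d$. The mean and the variance are asymptotic to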



d d 

l<£<d 

v|r„] = -L,og„ + 2-^ 



l<£<d 

The asymptotic normality (without rate) was already established in [14]. 
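For the hypercube model the mean can again be computed exactly for small $n$. The sketch below rests on two assumptions made for this illustration rather than on formulas legible in the text: the splitting probabilities are taken as $\mathbb{P}(I_n = k) = \binom{n-1}{k}\,\mathbb{E}[V^k(1-V)^{n-1-k}]$ with $V$ a product of $d$ independent uniform variables (so that $\mathbb{E}[V^a] = (a+1)^{-d}$), and the mean is compared with the alternating sum $\sum_{1\le k\le n}\binom{n}{k}(-1)^{k-1}\prod_{2\le j\le k}(1 - j^{-d})$ obtained by differentiating the product form of $P_n(y)$ at $y = 1$. Function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def pi_cube(n, k, d):
    """P(I_n = k) for the hypercube, using E[V^a] = (a + 1)^(-d)."""
    m = n - 1 - k
    s = sum(Fraction((-1) ** i * comb(m, i), (k + i + 1) ** d)
            for i in range(m + 1))
    return comb(n - 1, k) * s


def mean_by_recurrence(n_max, d):
    """E[Y_0], ..., E[Y_{n_max}] from E[Y_n] = 1 + sum_k P(I_n = k) E[Y_k]."""
    mu = [Fraction(0)]
    for n in range(1, n_max + 1):
        mu.append(1 + sum(pi_cube(n, k, d) * mu[k] for k in range(n)))
    return mu


def mean_by_sum(n, d):
    """Alternating sum with the product over 2 <= j <= k of (1 - j^(-d))."""
    total = Fraction(0)
    prod = Fraction(1)
    for k in range(1, n + 1):
        if k >= 2:
            prod *= 1 - Fraction(1, k ** d)
        total += (-1) ** (k - 1) * comb(n, k) * prod
    return total


def harmonic(n):
    return sum(Fraction(1, j) for j in range(1, n + 1))


cube1 = mean_by_recurrence(8, 1)
cube2 = mean_by_recurrence(8, 2)
print(cube2[3], float(cube2[8]))
```

For $d = 1$ the chain records are the classical records and the mean reduces to the harmonic number $H_n$.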
In the special case when d = 2, more explicit expressions are available 

EM = ^^, V[F.] = ^^^±^^, 

for n > 1. 
5 Dominating records in the $d$-dimensional simplex

We consider the mean and the variance of the number of dominating records in this section.

Let $Z_n$ denote the number of dominating records of $n$ iud points $p_1, \dots, p_n$ in the
$d$-dimensional simplex $S_d$.

Theorem 5 The mean and the variance of the number of dominating records for iud random
samples from the $d$-dimensional simplex are given by

$$\mathbb{E}[Z_n] = \sum_{1 \le k \le n} \frac{(d!)^k\, \Gamma(k)^d}{\Gamma(dk+1)}, \qquad (15)$$

$$\mathbb{V}[Z_n] = 2 \sum_{2 \le k \le n} \frac{(d!)^k\, \Gamma(k)^d}{\Gamma(dk+1)}\, H_{k-1}^{(d)}
  + \sum_{1 \le k \le n} \frac{(d!)^k\, \Gamma(k)^d}{\Gamma(dk+1)}
  - \left( \sum_{1 \le k \le n} \frac{(d!)^k\, \Gamma(k)^d}{\Gamma(dk+1)} \right)^2, \qquad (16)$$

respectively. The corresponding expressions for iud random samples from hypercubes are given
by $H_n^{(d)}$ and $H_n^{(d)} - H_n^{(2d)}$, respectively.

Proof.



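$$\mathbb{E}[Z_n] = \sum_{1 \le k \le n} \mathbb{P}(p_k \text{ is a dominating record})
  = \sum_{1 \le k \le n} d! \int_{S_d} \Bigl( d! \prod_{1 \le j \le d} x_j \Bigr)^{k-1} dx$$

$$= \sum_{1 \le k \le n} \frac{(d!)^k}{k} \prod_{1 \le j < d} \frac{\Gamma(k)\, \Gamma(jk+1)}{\Gamma((j+1)k+1)}
  = \sum_{1 \le k \le n} \frac{(d!)^k\, \Gamma(k)^d}{\Gamma(dk+1)}.$$

Thus, we obtain (15). For large $n$ and bounded $d$, the partial sum converges to the series

$$\sum_{k \ge 1} \frac{(d!)^k\, \Gamma(k)^d}{\Gamma(dk+1)}$$

at an exponential rate. For large $d$, the right-hand side is asymptotic to

$$\mathbb{E}[Z_n] = 1 + O\!\left(\frac{(d!)^2}{(2d)!}\right) = 1 + O\!\left(4^{-d}\sqrt{d}\right),$$

by Stirling's formula.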
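Similarly, for the second moment, we have

$$\mathbb{E}[Z_n^2] - \mathbb{E}[Z_n] = 2 \sum_{2 \le k \le n} \sum_{1 \le j < k}
  \mathbb{P}(p_j \text{ and } p_k \text{ are both dominating records})$$

$$= 2 \sum_{2 \le k \le n} \sum_{1 \le j < k} (d!)^k \int_{S_d} \int_{x \prec y}
  \Bigl( \prod_{1 \le i \le d} x_i \Bigr)^{j-1} \Bigl( \prod_{1 \le i \le d} y_i \Bigr)^{k-j-1} dx\, dy
  = 2 \sum_{2 \le k \le n} \frac{(d!)^k\, \Gamma(k)^d}{\Gamma(dk+1)}\, H_{k-1}^{(d)},$$

and we obtain (16).

For large $n$, the right-hand side of (16) converges to

$$2 \sum_{k \ge 2} \frac{(d!)^k\, \Gamma(k)^d}{\Gamma(dk+1)}\, H_{k-1}^{(d)}
  + \sum_{k \ge 1} \frac{(d!)^k\, \Gamma(k)^d}{\Gamma(dk+1)}
  - \left( \sum_{k \ge 1} \frac{(d!)^k\, \Gamma(k)^d}{\Gamma(dk+1)} \right)^2$$

at an exponential rate, which, for large $d$, is of order $\sqrt{d}\, 4^{-d}$. This explains the
curves corresponding to $Z_n$ in Figure 3.

The proof for the dominating records in hypercubes is similar and omitted.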
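The formulas (15) and (16) are easy to evaluate exactly. The following sketch (an illustration with hypothetical function names) writes $a_k$ for the probability that $p_k$ is a dominating record, namely $a_k = (d!)^k\,\Gamma(k)^d/\Gamma(dk+1)$ in the simplex and $a_k = k^{-d}$ in the hypercube, and uses the common shape $\mathbb{V}[Z_n] = 2\sum_{2\le k\le n} a_k H_{k-1}^{(d)} + \sum_k a_k - (\sum_k a_k)^2$, which for the hypercube collapses to $H_n^{(d)} - H_n^{(2d)}$.

```python
from fractions import Fraction
from math import factorial


def a_simplex(k, d):
    """P(p_k is a dominating record) in the simplex: (d!)^k Gamma(k)^d / Gamma(dk+1)."""
    return Fraction(factorial(d) ** k * factorial(k - 1) ** d, factorial(d * k))


def a_cube(k, d):
    """P(p_k is a dominating record) in the hypercube: k^(-d)."""
    return Fraction(1, k ** d)


def mean_and_variance(n, d, a):
    """Return (E[Z_n], V[Z_n]) for the record probabilities a(k, d)."""
    terms = [a(k, d) for k in range(1, n + 1)]
    mean = sum(terms)
    h_prev = Fraction(0)            # H_{k-1}^{(d)} before processing index k
    cross = Fraction(0)
    for k in range(1, n + 1):
        cross += terms[k - 1] * h_prev
        h_prev += Fraction(1, k ** d)
    return mean, 2 * cross + mean - mean ** 2


def gen_harmonic(n, a):
    return sum(Fraction(1, i ** a) for i in range(1, n + 1))


mean_s, var_s = mean_and_variance(30, 2, a_simplex)
mean_c, var_c = mean_and_variance(30, 2, a_cube)
print(float(mean_s), float(var_s), float(mean_c), float(var_c))
```

In dimension one both models reduce to the classical records, with mean $H_n$ and variance $H_n - H_n^{(2)}$, which gives a simple cross-check of the two variants.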